ABSTRACT

The tremendous backlog of unanalyzed 
satellite data necessitates the development of 
improved methods for data cataloging and 
analysis. Ford Aerospace has developed an 
image analysis system, SIANN, that 
integrates the technologies necessary to 
satisfy NASA's science data analysis 
requirements for the next generation of 
satellites. SIANN will enable scientists to 
train a neural network to recognize image data 
containing scenes of interest and then rapidly 
search data archives for all such images. The 
approach combines conventional image 
processing technology with recent advances 
in neural networks to provide improved 
classification capabilities. SIANN allows 
users to proceed through a four-step process
of image classification: filtering and 
enhancement, creation of neural network 
training data via application of feature 
extraction algorithms, configuring and 
training a neural network model, and 
classification of images by application of the 
trained neural network. A prototype 
experimentation testbed has been completed 
and applied to climatological data. 

INTRODUCTION 

Data acquired from satellites are essential 
resources in meteorology, agriculture, 
astronomy, forestry, geology, oceanography, 
and many other fields. Cataloging and 
analysis of image data has been a 
fundamental problem for NASA. For 
instance, in 1986 a team of scientists at the 
South Pole took readings overhead and 
learned that the "hole" in the Earth's ozone 
was getting worse. It was later discovered 
that the hole actually showed up in 1976 in 
Nimbus 7 satellite data. Concerning this 
discovery, James L. Green, head of the 
NASA National Space Science Data Center 
stated in (Kneale, 1988), "It's one of
probably hundreds of important discoveries
we have sitting in the basement." To 
compound this problem, the next generation 
of scientific satellites will generate far greater 
amounts of data. 

How will such an enormous database be 
accessed, and how will large amounts of data 
be analyzed? To help provide solutions to
these questions, Ford Aerospace is
investigating neural network technology to 
determine how it can provide improved 
satellite image analysis capabilities. A 
prototype system called SIANN (Satellite 
Image Analysis using Neural Networks) has 
been developed which combines conventional
image processing techniques with neural 
networks. Currently, SIANN addresses the 
image cataloging problem; that is, the 
generation of summary information, or 
"metadata", from raw image data. The 
metadata are stored in a database which will 
enable scientists to rapidly retrieve images 
containing scenes of interest. 

SIANN is intended to be a general- 
purpose classification system. It will be 
embedded into large satellite data 
management systems and provide a library of 
feature extraction and classification programs 
to support dozens of scientific disciplines. 

Scientists working in different domains 
may be interested in the same data. As such, 
it is necessary to apply a variety of algorithms 
to the raw image data to create the metadata 
that will support queries from multiple 
scientific domains. 

Scientists develop classifiers in SIANN 
by using the following procedure, which is 
illustrated in figure 1: 

1) Select (or develop if necessary) 
feature extraction algorithms which 
are appropriate for the scientific
discipline of interest and create a
training set, T, of patterns 
representative of the desired classes 

2) Configure and train a supervised 
learning neural network to identify the 
desired classes of image scenery 
using the training set T 

3) Test the classifier by applying it to 
novel data; if the results are not 
satisfactory, repeat steps 1 and/or 2 
and then retest. 



Figure 1. The SIANN image classification process. 
Rectangles represent data and ovals represent operations. 


The creation of neural network training 
sets by applying feature extraction algorithms 
has proven successful in a number of 
different applications (Rimey, 1986; Beck, 
1989). Another useful approach is to classify 
individual pixels from multispectral images 
(Campbell, 1989; McClellan, 1989). 

This paper presents initial results of 
SIANN applied to climatological image data. 
First, the feature extraction process is 
described. Next, the neural network 
classification technique is presented. Then an 
experiment is described which analyzes the 
effects that varying the number of training
set features has on a neural network's
classification correctness. Finally, 
conclusions are made and directions of future 
work are stated. 

FEATURE EXTRACTION 

Garand (1988) describes 13 features
representing height, albedo, shape, and
multilayering characteristics of cloud fields.
Table 1 lists 12 of these features (Garand's
feature for 'Fraction of cloudy pixels with D
< D_r' was not included) plus three simple
statistical features and the 'Number
background' feature, which is a variation of
the 'Number of clouds' feature.


Table 1. Features used to classify climatological data.
Image source: VIS, visible; IR, infrared; B&W, binary
corresponding to visible cloud fraction; PS, power
spectrum.

    Description                                   Limits

 1. Total cloud fraction (IR, VIS)                0-1
 2. Low cloud fraction (IR)                       0-1
 3. Middle cloud fraction (IR)                    0-1
 4. High cloud fraction (IR)                      0-1
 5. Cloud height of uppermost layer (IR)          0-14 km
 6. Fraction of cloudy pixels (VIS)               0-1
 7. Mean albedo of cloudy pixels (VIS)            0-1
 8. Number of clouds (B&W)                        0-m/2
 9. Multilayer index (IR)                         0-1
10. Background connectivity (B&W)                 0-1
11. Cloud connectivity (B&W)                      0-1
12. Streakiness factor (PS)                       0-1
13. Fraction of spectral intensity associated
    with wavelengths between 20-40 km (PS)        0-1
14. Minimum pixel value (VIS)                     0-255
15. Maximum pixel value (VIS)                     0-255
16. Range of pixel values (VIS)                   0-255
17. Number background (VIS)                       0-m/2
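
As an illustration of the simplest entries in Table 1, the following sketch computes features 14 to 16 (minimum, maximum, and range of pixel values) for one image region. It is not SIANN's code; the function name and the representation of a region as a list of rows of 8-bit values are assumptions made for the example.

```python
def simple_statistical_features(region):
    """Return (minimum, maximum, range) of the pixel values in one region.

    `region` is assumed to be a list of rows, each row a list of
    8-bit pixel values (0-255). This layout is hypothetical.
    """
    pixels = [p for row in region for p in row]
    if not pixels:
        raise ValueError("region must contain at least one pixel")
    lowest, highest = min(pixels), max(pixels)
    return lowest, highest, highest - lowest


# A tiny 3 x 3 region with made-up pixel values.
region = [
    [12, 40, 200],
    [35, 90, 255],
    [0, 17, 128],
]
features = simple_statistical_features(region)
```

Features of this kind need only one pass over the pixels, which is why they are attractive when a classifier must be run over a very large archive.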


Note that Garand's work is solely 
directed at the classification of 20 cloud 
types, without regard to computation time. 
Since classifiers created from SIANN will be 
applied to immense databases, there is 
usually a time/accuracy tradeoff. That is, 
computationally inexpensive features are
usually selected over more accurate, but more
expensive features. Likewise, smaller neural
networks (i.e., fewer nodes) are preferred to
larger networks.

Figure 2. SIANN enables scientists to create a neural network training set by specifying the desired features and classes, and then selecting image regions representative of the classes.

Figure 2 shows the SIANN user interface
for creating a training set. The first step is to 
enter the desired classes of image scenes into
the "Classes" box. Next, a set of features is
selected from the "Features" box. Then
image regions that are representative of the 
classes are selected as depicted by the starred
("*") regions in figure 2. Odd-sized regions
may also be selected in addition to
the fixed-size grid regions. Finally, a
command is issued to generate the feature 
vectors for the selected image regions. The 
"Patterns" box shown in figure 2 contains the 
feature vectors generated from the selected 
image regions (but only for the currently 
selected class in the "Classes" box). Each 
feature vector consists of several real numbers
typically in the range of 0...1, which are used 
as inputs to the neural network. Each class 
defined by the user is represented by one 
output of the neural network. For the 
remainder of this paper, the training set 
shown in figure 2 shall be referred to as the 
"test" training set. 

NEURAL NETWORK 
CLASSIFICATION 

SIANN uses the popular backpropagation
algorithm (Rumelhart, 1986). Figure 3 
illustrates the general topology of a back 
propagation network. The bottom layer of 
nodes is the input layer where patterns are 
presented to the network. The top layer 
contains the output nodes which indicate the 
class of the input pattern. Any number of
internal layers is permitted, but typically one
is sufficient. (Ho (1989) concludes that it is
generally better to increase the width of the
network than to add layers.)
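
The flow of a pattern from the input layer to the output layer can be sketched as follows. This is a minimal illustration, not SIANN's implementation: the logistic activation, the bias terms, the weight layout, and the rule of reporting the strongest output node are all assumptions, since the text does not specify them.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed; the paper does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one layer's outputs; weights[j] holds the weights into node j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(pattern, hidden_w, hidden_b, output_w, output_b):
    """Return (index of strongest output node, all output values)."""
    hidden = layer_forward(pattern, hidden_w, hidden_b)
    outputs = layer_forward(hidden, output_w, output_b)
    strongest = max(range(len(outputs)), key=lambda i: outputs[i])
    return strongest, outputs


# Hypothetical 2-input, 2-hidden, 2-output network with hand-picked weights.
hidden_w = [[0.0, 0.0], [0.0, 0.0]]
hidden_b = [0.0, 0.0]
output_w = [[1.0, 1.0], [-1.0, -1.0]]
output_b = [0.0, 0.0]
strongest, outputs = forward([0.3, 0.8], hidden_w, hidden_b, output_w, output_b)
```

Widening the internal layer only lengthens `hidden_w` and `hidden_b`, whereas adding a layer requires another call to `layer_forward`.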

Given a training set, SIANN will 
automatically configure and initialize a 
network. Figure 4 describes the network that 
SIANN generated from the test training set. 
(This network will be referred to as the "test" 
network.) Note that the number of input 
nodes matches the number of features and the 
number of output nodes matches the number
of classes. Each feature value of the input
patterns is scaled from the corresponding 
limits in table 1 to the range 0.1 to 0.9. 
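
The scaling step can be sketched as a linear map from a feature's limits to the interval 0.1 to 0.9. The text says only that values are scaled, so the linear form, the clipping of out-of-range values, and the function name are assumptions.

```python
def scale_feature(value, lower, upper, out_low=0.1, out_high=0.9):
    """Linearly map `value` from [lower, upper] to [out_low, out_high].

    Values outside the limits are clipped first (an assumption).
    """
    if upper <= lower:
        raise ValueError("upper limit must exceed lower limit")
    clipped = min(max(value, lower), upper)
    fraction = (clipped - lower) / (upper - lower)
    return out_low + (out_high - out_low) * fraction


# Cloud height of the uppermost layer has limits 0-14 km in Table 1.
scaled_height = scale_feature(7.0, 0.0, 14.0)
# Maximum pixel value has limits 0-255 in Table 1.
scaled_max = scale_feature(255, 0, 255)
```

Keeping inputs away from 0 and 1 mirrors the 0.1 and 0.9 values used for the target outputs.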



Figure 3. A back propagation neural network receives 
a pattern in its bottom, input layer and computes the 
pattern's class in the top, output layer. 




Figure 4. SIANN automatically configures a back 
propagation neural network to train on a specified training 
set. 


Before training begins, SIANN 
automatically creates unary vectors for the 
target outputs. For the 2-class test network, 
the vectors are: 

Class 0: 0.1 0.9 

Class 1: 0.9 0.1 
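
A small sketch of how such target vectors might be generated is given below. The position of the 0.9 entry is chosen so that the 2-class case reproduces the listing above; how SIANN orders the outputs for more than two classes is not stated, so that extrapolation and the function name are assumptions.

```python
def unary_targets(num_classes, low=0.1, high=0.9):
    """Return one target vector per class, with a single `high` entry each.

    The high entry for class k is placed at position num_classes - 1 - k,
    which matches the 2-class listing in the text (assumed to generalize).
    """
    targets = []
    for k in range(num_classes):
        vector = [low] * num_classes
        vector[num_classes - 1 - k] = high
        targets.append(vector)
    return targets


two_class_targets = unary_targets(2)
```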

After training, the network is tested by 
applying it to new data (i.e., data that it
wasn't trained with). Figure 5 shows the
results of classifying the test image with the
test network. Misclassifications are denoted
by an "Xi", where X means WRONG and i
is the class computed by the network. If the
network does not perform satisfactorily, the 
scientist may modify the training set and/or 
network and then retest. 

EXPERIMENTAL RESULTS 

The data set used contains three 1024 x
384 8-bit AVHRR (Advanced Very High
Resolution Radiometer) images of the Indian
Ocean. Each 8-bit pixel has a footprint of 1
km². For purposes of this discussion, we
shall focus on the image shown in figure 2, 
which has been overlaid with the author's 
subjective classification of the picture. (This 
image will be referred to as the "test" image.) 


Table 2. Six training sets were created containing
from 4 to 9 features.

                          Training Set
Feature                   1  2  3  4  5  6

Mean albedo
Low cloud fraction
Middle cloud fraction
High cloud fraction
Cloud top height
Cloud fraction
Number background
Maximum pixel value
Range of pixel values


An experiment was conducted to 
determine the effects of modifying the set of 
selected features. Six 2-class training sets 
were created, each containing patterns from
the 32 x 32 pixel grid regions lying in the rectangle
whose upper left grid coordinates are (4, 10) 
and lower right grid coordinates are (7, 20). 
This provided 25 Cloudy patterns and 19 
Clear patterns for each training set. The 
features used in each training set are listed in 
table 2. Each training set was used to train a 
network. Each network had a single internal 
layer of 15 nodes. Equation (1) was used
during training to modify each weight, where
the learning rate α = 0.9, and the momentum
term η = 0.7.

w_{ij}(t+1) = w_{ij}(t) + \eta \delta_j x_i + \alpha (w_{ij}(t) - w_{ij}(t-1))    (1)
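
Reading Equation (1) as the usual backpropagation weight update with a momentum term, a single update can be sketched as below. The coefficient names follow the equation as printed: `eta` multiplies the delta term and `alpha` multiplies the previous weight change. The defaults use the values quoted above for those two symbols; the function name and the sample numbers are hypothetical.

```python
def update_weight(w_now, w_prev, delta_j, x_i, eta=0.7, alpha=0.9):
    """Return w_ij(t+1) from w_ij(t), w_ij(t-1), delta_j and x_i.

    delta_j is the error term of the receiving node and x_i is the
    output of the sending node, following the usual convention.
    """
    gradient_step = eta * delta_j * x_i
    momentum_step = alpha * (w_now - w_prev)
    return w_now + gradient_step + momentum_step


# Made-up values: the weight rose from 0.4 to 0.5 on the previous step.
w_next = update_weight(w_now=0.5, w_prev=0.4, delta_j=0.1, x_i=1.0)
```

The momentum step carries the weight further in the direction it last moved, which damps oscillation when successive error terms change sign.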

Convergence for each network occurred 
when the maximum error fell below 0.1. 
Table 3 summarizes the results of training the 
networks. The third column specifies how 
many training iterations each network 
required to converge. The fourth column 
lists the CPU time of each training run on a 
VAXstation 3540. Note that when the 
number of features decreases, the time for
each iteration also decreases since the number 
of nodes in the network is reduced. 


Table 3. Training times increase as the number of
input features decreases.

Training    Number      Number
Set         Features    Iterations    Minutes

1           4           1790          5.6
2           5           2278          7.8
3           6           711           2.7
4           7           351           1.5
5           8           316           1.3
6           9           456           2.1
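
The per-iteration cost mentioned in the text can be checked from the Table 3 figures with a few lines of arithmetic. The values are copied from the table; since the minutes are rounded to one decimal place, the derived times are approximate.

```python
# (training set, number of features, iterations, CPU minutes) from Table 3.
TABLE_3 = [
    (1, 4, 1790, 5.6),
    (2, 5, 2278, 7.8),
    (3, 6, 711, 2.7),
    (4, 7, 351, 1.5),
    (5, 8, 316, 1.3),
    (6, 9, 456, 2.1),
]


def seconds_per_iteration(iterations, minutes):
    """Convert a total CPU time in minutes to seconds per training iteration."""
    return minutes * 60.0 / iterations


# Map number of features -> approximate seconds per iteration.
per_iteration = {
    features: seconds_per_iteration(iterations, minutes)
    for _, features, iterations, minutes in TABLE_3
}

for features in sorted(per_iteration):
    print(f"{features} features: {per_iteration[features]:.3f} s per iteration")
```

The 4-feature network is the cheapest per iteration and the 9-feature network the most expensive, even though the smaller networks needed far more iterations to converge.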


The nonlinearity of the number of 
iterations and minutes for training runs can be 
attributed to the characteristics of certain 
features. Specifically, it would appear that 
the addition of the 'Cloud top height' feature 
that distinguishes training set 1 from training 
set 2 detracts from the separability of patterns 
within training set 2. Similarly, the addition 
of the 'Range of pixel values' feature detracts 
from the separability of training set 6. 

Each network was tested by applying it to 
all 384 grid regions of the test image. Figure 
6 plots the percentage of misclassified 
regions vs. the number of features. The 
number of classification errors tends to 
decrease when a larger number of input 
features are used.

Figure 5. The neural network's classifications of the image regions are compared to the scientist's classifications to help refine the classification process.




Figure 6. Neural network classification errors tend to 
increase as the number of input features decreases. 
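
The quantity plotted in Figure 6 can be sketched as follows. The 32 x 32 pixel region size is an assumption that is consistent with a 1024 x 384 image yielding 384 grid regions; the function names and the sample labels are hypothetical.

```python
IMAGE_WIDTH, IMAGE_HEIGHT = 1024, 384  # image size given in the text
REGION_SIZE = 32                       # assumed side length of a grid region


def grid_shape(width, height, region_size):
    """Return (rows, columns) of the grid of square regions covering the image."""
    return height // region_size, width // region_size


def misclassification_percentage(predicted, expected):
    """Percentage of regions whose predicted class differs from the expected one."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("label lists must be non-empty and of equal length")
    wrong = sum(1 for p, e in zip(predicted, expected) if p != e)
    return 100.0 * wrong / len(expected)


rows, columns = grid_shape(IMAGE_WIDTH, IMAGE_HEIGHT, REGION_SIZE)
total_regions = rows * columns

# Made-up labels for four regions (1 = Cloudy, 0 = Clear).
error_rate = misclassification_percentage([0, 1, 1, 0], [0, 1, 0, 0])
```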


Scientists using SIANN can analyze 
training and classification results to select the 
optimal set of features and most efficient 
neural network configuration. 

CONCLUSIONS 

SIANN is an image analysis system that 
combines conventional image processing 
techniques with neural network classifiers. 
Scientists may quickly develop a customized 
classifier using SIANN's menu-driven 
graphical interface. Analysis of the 
classifier's behavior helps the scientist 
improve the classifier by modifying the 
training set features and neural network 
configuration. 

Preliminary testing of the system on 
climatological data has demonstrated that 
neural networks are a viable technique for 
image analysis. SIANN will continue to 
evolve by adding feature extraction programs 
for other scientific domains. Another future 
direction is to investigate unsupervised 
learning neural networks to determine if the 
classifier refinement process can be 
automated; that is, to see if a network can 
discover by itself a good set of features. 

REFERENCES 

Beck, H., McDonald, D., & Brzakovic, D. 
(1989). A self-training visual inspection 
system with a neural network classifier. 
Proceedings from the 1989 International
Joint Conference on Neural Networks (pp.
1-307 - 1-311). Washington, D.C.

Campbell, W., Hill, S., & Cromp, R. 
(1989). The utilization of neural nets in 
populating an object-oriented database. 
Proceedings from the 1989 Goddard 
Conference on Space Applications of 
Artificial Intelligence, (pp. 249-263). 
Greenbelt, MD. 

Garand, L. (1988). Automated recognition of 
oceanic cloud patterns. Part I: Methodology 
and application to cloud climatology. 
Journal of Climate, 1(1), 20-39. 

Ho, C. (1989). On multi-layered 
connectionist models: Adding layers vs. 
increasing width. Proceedings from the 
Eleventh International Joint Conference on 
Artificial Intelligence, (pp. 176-179). 
Washington, D.C. 

Kneale, D. (January 11, 1988). What 
becomes of data sent back from space? Not 
a lot, as a rule. Wall Street Journal, p. 1. 

McClellan, G., DeWitt, R., Hemmer, T., 
Matheson, L., & Moe, G. (1989). 
Multispectral image-processing with a three- 
layer backpropagation network. 
Proceedings from the 1989 International 
Joint Conference on Neural Networks (pp. 
1-151 - 1-153). Washington, D.C. 

Rimey, R., Gouin, P., Scofield, C., & 
Reilly, D. (1986). Real-time 3-D object 
classification using a learning system. SPIE 
Intelligent Robots: 6th International 
Conference on Robot Vision and Sensory 
Controls, RoViSeC, SPIE Vol. 726. (pp. 
552-557). Cambridge, MA. 

Rumelhart, D., & McClelland, J. (1986). 
Parallel Distributed Processing : 
Explorations in the Microstructure of 
Cognition. Cambridge, MA: MIT Press. 






NASA
National Aeronautics and
Space Administration

Report Documentation Page

1. Report No.: NASA CP-3068

2. Government Accession No.:

3. Recipient's Catalog No.:

4. Title and Subtitle: 1990 Goddard Conference on Space Applications
of Artificial Intelligence

5. Report Date: May 1990

6. Performing Organization Code: Code 500

7. Author(s): James L. Rash, Editor

8. Performing Organization Report No.: 90B00078

9. Performing Organization Name and Address:
Mission Operations & Data Systems Directorate
NASA/GSFC, Code 500
Greenbelt, MD 20771

10. Work Unit No.:

11. Contract or Grant No.:

12. Sponsoring Agency Name and Address:
National Aeronautics & Space Administration
Washington, DC 20546

13. Type of Report and Period Covered: Conference Publication

14. Sponsoring Agency Code: Code 500

15. Supplementary Notes:
James L. Rash is associated with Goddard Space Flight Center, Greenbelt, Maryland.

16. Abstract:
This publication comprises the papers presented at the 1990 Goddard
Conference on Space Applications of Artificial Intelligence held at the
NASA/Goddard Space Flight Center, Greenbelt, Maryland on May 1-2, 1990.
The purpose of this annual conference is to provide a forum in which
current research and development directed at space applications of
artificial intelligence can be presented and discussed. The papers in
this proceedings fall into the following areas: Planning & Scheduling,
Fault Monitoring/Diagnosis, Image Processing & Machine Vision, Robotics/
Intelligent Control, Development Methodologies, Information Management,
and Knowledge Acquisition.

17. Key Words (Suggested by Author(s)):
Artificial intelligence, expert systems,
planning and scheduling, fault isolation,
fault diagnosis, image processing, machine
vision, data management, robotics, control,
knowledge acquisition.

18. Distribution Statement: Unclassified - unlimited

19. Security Classif. (of this report): Unclassified

20. Security Classif. (of this page): Unclassified

21. No. of pages: 368

22. Price: A16

NASA FORM 1626 OCT 86

NASA-Langley, 1990




